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from a distributed control 
system (DCS) (10) as found 
in a manufacturing process; 
the device controller ensures 
that a contiguous stream of 
data from the DCS is provided 
to the predictive device at a 
synchronous discrete base 
sample time. In prediction 
mode, the device controller 
operates the predictive device 
once per base sample time and 
receives the output from the 
predictive device through path 
(14). In horizon mode and 

reverse horizon mode, the device controller operates the predictive device additionally many times during base sample time interval. In 
horizon mode, additional data is provided through path (52). In reverse horizon mode data is passed in a reverse direction through the 
device, utilizing information stored during horizon mode, and returned to the device controller through path (66). In the forward modes, the 
data are passed to a series of preprocessing units (20) which convert each input variable (18) from engineering units to normalized units. 
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NON-LINEAR DYNAMIC PREDICTIVE DEVICE 

I. FIELD OF THE INVENTION 

The present invention pertains to a predictive device that models the dynamic 
5 input/output relationships of a physical process, particularly in the process industries 
such as hydrocarbons, polymers, pulp and paper, and utilities. The predictive device 
is primarily for multivariable process control, but is also applicable to dynamic 
process monitoring, or to provide a continuous stream of inferred measurements in 
place of costly or infrequent laboratory or analyzer measurements. 

10 II. BACKGROUND OF THE INVENTION 

Mosi existing industrial products designed for multivariable model predictive 
conirol (MPC) employ linear step-response models or finite impulse response (FIR) 
models. These approaches result in over-parameterization of the models (Qin and 
BadgwclU 1 996). For example, the dynamics of a first order single input/single 

1 5 output SISO process which can be represented with only three parameters (gain, time 
constant and dead-time) in a parametric form typically require from 30 to 120 
coefficients to describe in a step-response or FIR model. This over-parameterization 
problem is exacerbated for non-linear models since standard non-parametric 
approaches, such as Volterra series, lead to an exponential growth in the number of 

20 parameters to be identified. An alternative way to overcome these problems for non- 
linear systems is the use of parametric models such as input-output Nonlinear Auto- 
Regressive with exogenous inputs (NARX). Though NARX models are found in 
many case-studies, a problem with NARX models using feed forward neural 
networks is that they offer only short-term predictions (Su, et al, 1992). MPC 

25 controllers require dynamic models capable of providing long-term predictions. 

Recurrent neural networks with internal or external feedback connections provide a 
better solution to the long-term prediction problem, but training such models is very 
difficult. 
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The approach described in (Graettinger, et al, 1994) and (Zhao, et al, 1997) 
provides a partial solution to this dilemma. The process model is identified based on 
a set of decoupled first order dynamic filters. The use of a group of first order 
dynamic filters in the input layer of the model enhances noise immunity by 
5 eliminating the output interaction found in NARX models. This structure 

circumvents the difficulty of training a recurrent neural network, while achieving 
good long-term predictions. However, using this structure to identify process 
responses that are second order or higher can result in over sensitive coefficients and 
in undesirable interactions between the first order filters. In addition, this approach 

1 0 usually results in an oversized model structure in order to achieve sufficient 
accuracy, and the model is not capable of modeling complex dynamics such as 
oscillatory effects. In the single input variable case, this first order structure is a 
special case of a more general nonlinear modeling approach described (Sentoni et al., 
1996) that is proven to be able to approximate any discrete, causal, time invariant, 

1 5 nonlinear SISO process with fading memory. In this approach a Laguerre expansion 
creates a cascaded configuration of a low pass and several identical band pass first 
order filters. One of the problems of this approach is that may it require an 
excessively large degree of expansion to obtain sufficient accuracy. Also, it has not 
been known until now how to extend this methodology in a practical way to a multi- 

20 input system. 

This invention addresses many essential issues for practical non-linear 
multivariable MPC. It provides the capability to accurately identify non-linear 
dynamic processes with a structure that 

• has close to minimum parameterization 

25 • can be practically identified with sufficient accuracy 

• makes good physical sense and allows incorporation of process knowledge 

• can be proven to identify a large class of practical processes 

• can provide the necessary information for process control 
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III. SUMMARY OF THE INVENTION 

The present invention is a dynamic predictive device that predicts or 
estimates values of process variables that are dynamically dependent on other 
measured process variables. This invention is especially suited to application in a 
5 model predictive control (MPC) system. The predictive device receives input data 
under the control of an external device controller. The predictive device operates in 
either configuration mode or one of three runtime modes - prediction mode, horizon 
mode, or reverse horizon mode. 

The primary runtime mode is the prediction mode. In this mode, the input 
10 data are such as might be received from a distributed control system (DCS) as found 
in a manufacturing process. The device controller ensures that a contiguous stream of 
data from the DCS is provided to the predictive device at a synchronous discrete base 
sample time. The device controller operates the predictive device once per base 
sample time and receives the prediction from the output of the predictive device. 

15 After the prediction mode output is available, the device controller can switch 

to horizon mode in the interval before the next base sample time. The predictive 
device can be operated many times during this interval and thus the device controller 
can conduct a series of experimental scenarios in which a sequence of input data can 
be specified by the device controller. The sequence of input data can be thought of as 

20 a data path the inputs will follow over a forward horizon. The sequence of 

predictions at the output of the controller is a predicted output path over a prediction 
horizon and is passed to the device controller for analysis, optimization, or control. 
The device controller informs the predictive device at the start of an experimental 
path and synchronizes the presentation of the path with the operation of the device. 

25 Internally, horizon mode operates exactly the same way as prediction mode, except 
that the dynamic states are maintained separately so that the predictive device can 
^resume normal prediction mode operation at the next base sample time. In addition, 
the outputs of the filter units are buffered over the course of the path and are used 
during reverse horizon operation of the device. 
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The purpose of reverse horizon mode is to obtain the sensitivities of the 
predictive device to changes in an input path. Reverse horizon mode can only be set 
after horizon mode operation has occurred. The device controller first informs the 
predictive device the index of the point in the output path for which sensitivities are 
5 required. The device controller then synchronizes the reverse operation of the 

predictive device with the output of sensitivity data at the input paths of the device. 

In forward operation, each input is scaled and shaped by a preprocessing unit 
before being passed to a corresponding delay unit which time-aligns data to resolve 
dead time effects such as pipeline transport delay. Modeling dead-times is an 

1 0 important issue for an MPC system. In practical MPC, prediction horizons are 

usually set large enough so that both dynamics and dead-time effects are taken into 
account; otherwise the optimal control path may be based on short term information, 
and the control behavior may become oscillatory or unstable. In the preferred 
embodiment, the predictive device is predicting a single measurement, and the dead- 

15 time units align data relative to the time of that measurement. If predictions at several 
measurement points are required, then several predictive devices are used in parallel. 
During configuration mode, the dead times are automatically estimated using 
training data collected from the plant. In the preferred embodiment the training 
method consists of constructing individual auto-regressive models between each 

20 input and the output at a variety of dead-times, and choosing the dead time 

corresponding to the best such model. As with other components of the invention, 
manual override of the automatic settings is possible and should be used if there is 
additional process knowledge that allows a more appropriate setting. 

Each dead time unit feeds a dynamic filter unit. The dynamic filter units are 
25 used to represent the dynamic information in the process. Internally the dynamic 

filter units recursively maintain a vector of states. The states derive their values from 
states at the previous time step and from the current input value. This general filter 
type can be represented by what is known to those skilled in the art as a discrete state 
space equation. The preferred embodiment imposes a much-simplified structure on 
30 the filter unit that allows for fast computation for MPC and also allows intelligent 
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override of the automatic settings. This simplified structure is composed of first and 
second order loosely coupled subfilters, only one of which receives direct input from 
the corresponding delay unit. The practical identification of this filter structure is an 
essential part of this invention. 

5 The outputs of the dynamic filter units are passed to a non-linear analyzer that 

embodies a static mapping of the filter states to an output value. The exact nature of 
the non-linear analyzer is not fundamental to this invention. It can embody a non- 
linear mapping such as a Non-linear Partial Least Squares model or a Neural 
Network, or a hybrid combination of linear model and non-linear model. The 
1 0 preferred embodiment makes use of a hybrid model. The reason for this is that a non- 
parametric non-linear model identified from dynamic data (such as a neural net) 
cannot, by its nature, be fully analyzed and validated prior to use. The non-linearity 
of the model means that different dynamic responses will be seen at different 
operating points. If the process being modeled is truly non-linear, these dynamic 
1 5 responses will be an improvement over linear dynamic models in operating regions 
corresponding to the training data, but may be erroneous in previously unseen 
operating regions. When the non-linear model is used within the context of MPC, 
erroneous responses, especially those indicating persistent and invalid gain reversals 
can create instabilities in the MPC controller. With a hybrid approach, a non-linear 
20 model is used to model the errors between the linear dynamic model and the true 
process. The hybrid dynamic model is a parallel combination of the linear dynamic 
model with the error correction model. The dynamic response of the linear model can 
be analyzed completely prior to use, since the gains are fixed and independent of the 
operating point. The process engineer can examine and approve these gains prior to 
25 closing the loop on the process and is assured of responses consistent with the true 
process. However, the linear dynamic response will be sub-optimal for truly non- 
linear processes. In online operation of the hybrid model within an MPC framework, 
the responses of the linear model and the hybrid model can be monitored 
independently and compared. In operating regions where the non-linear model shows 
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persistently poor response, control can be switched, either automatically or by the 
operator, back to the safety of the linear model. 

The output of the non-linear analyzer is passed through a postprocessing unit 
that converts the internal units to engineering units. 

5 The importance of this invention is that its structure is shown to be able to 

approximate a large class of non-linear processes (any discrete, causal, time 
invariant, nonlinear multi-input/single output (MISO) process with fading memory), 
but is still simple enough to allow incorporation of process knowledge, is 
computationally fast enough for practical non-linear MPC, and can be configured 
1 0 with sufficient accuracy in a practical manner. 

IV. BRIEF DESCRIPTION OF THE DRAWINGS 

The textual description of the present invention makes detailed reference to 
the following drawings: 

FIG 7 is an overall block diagram of the invention showing both the runtime and 
1 5 training components . 

FIG 2 shows the runtime structure of an individual preprocessing unit. 

FIG 3 shows the runtime structure of an individual delay unit. 

FIG. 4 shows the forward flow internal decomposition of an individual filter unit into 
cascaded subfilter units. 

20 FIG 5 shows the preferred forward flow structure of a primary first order subfilter 
unit. 

FIG 6 shows the preferred forward flow structure of a secondary first order subfilter 
unit and the preferred coupling with the previous subfilter in the cascade. 

FIG 7 shows the preferred forward flow structure of a primary second order subfilter 
25 unit. 

FIG 8 shows the preferred forward flow structure of a secondary second order 
subfilter unit and the preferred coupling with the previous subfilter in the cascade. 
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FIG. 9 shows a typical feedforward configuration of the non-linear analyzer. 

FIG JO shows the reverse flow configuration of the non-linear analyzer depicted in 
FIG. 9. 

FIGJ1 shows the reverse flow internal decomposition of an individual filter unit into 
5 cascaded subfilter units. 

FIG. 12 shows a method of training an individual delay unit. 

FIG. 13 shows the first order decoupled structure used at the start of each iteration of 
the preferred dynamic filter unit identification method. 

FIG 14 shows that reverse flow of data through a matrix structure can be described 
10 mathematically by forward flow of data through the transpose matrix structure. 

V. DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS 

FIG J is an overall block diagram of the invention and its context. An 
external device controller (50) synchronizes the flow of data to and from the 
predictive device via the data paths (18), (14), and (64). The device controller also 
15 controls the mode of operation and the path stepping of the predictive device via the 
control path (54). The external device controller may also communicate with a DCS 
(10) or other data/control system both for requesting data and for requesting control 
changes to the modeled process; however the exact external context and 
configuration of the device controller is beyond the scope of this application. 

20 V.1 FORWARD RUNTIME OPERATION OF THE PREDICTION DEVICE 

The figures and equations in this detailed description refer to an index k that 
represents a data point in a sequence of data points. This index has different 
meanings depending on whether the forward operational mode of the device is 
prediction mode or horizon mode. 

25 In prediction mode data is provided at a regular sampling interval At to the 

input nodes (18) of the device. Data is passed in a forward direction through the 
device. For simplicity of notation, the sample point T 0 +kAt is denoted by the index k. 
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In horizon mode, a sequence of data representing a forward data path is 
provided to the inputs. This data path may represent a proposed path for manipulated 
variables for process control purposes, or may represent a holding of the inputs to 
constant values in order to determine the steady state output of the device. The 
5 starting point of this path is taken to be the most recent input sample provided in 
prediction mode. Index 0 represents this starting point and index k represents the k? h 
data point in this path. 

V.1.1 FORWARD RUNTIME OPERATION OF A PREPROCESSING UNIT 
Each input feeds a preprocessing unit (20) which is used to convert the 
1 0 engineering units of each data value to a common normalized unit whose lower and 
upper limits are, by preference, -1 and 1 respectively, or 0 and 1 respectively. 

The preprocessing unit can also shape the data by passing it through a non- 
linear transformation. However, the preferred embodiment uses a simple scale and 
offset as shown in FIG. 2 and equation (1): 

15 ■ u(k)=su E (k)+o (1) 

where u^fk) is the value of an input in engineering units, and u(k) is the preprocessed 
value in normalized units. The scale and offset values as stored in the configuration 
file (30 - FIG. 1) are, in general, different for each input variable, and are determined 
in the configuration mode. 

20 V.1.2 FORWARD RUNTIME OPERATION OF A DELAY UNIT 

Data flows from each preprocessing unit to a corresponding delay unit (22). 
The forward run- time operation of the delay unit (22) is shown in FIG 3 and equation 
(2). The output u d (k)(304) of an individual delay unit (300) is equal to the input u(k) 
(302) delayed by d sample times. The value of d may be different for each delay unit 

25 (22) and is retrieved from the configuration file (30 - FIG. J). This may be 
implemented as a shift register with a tap at the cf h unit. 

u d (k) = u(k-d) (2) 
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This equation can also be written in terms of the unit delay operator q l \ 
u d {k)=q- d u{k) 

V.1.3 FORWARD RUNTIME OPERATION OF THE FILTER UNITS 

5 Referring again to FIG 7, each delayed input value is passed to an individual 

filter unit (24). The general internal feedforward structure of a filter unit (24) is 
shown in FIG. 4. The general feedforward structure is composed of S cascaded 
subfilters (402, 404, 406). The first subfilter in the cascade (400) is referred to as 
the primary subfilter. Non-primary subfilters are referred to as secondary 
1 0 subfilters. All the subfilters are alike except that the primary subfilter receives no 
input from another subfilter, and the final subfilter sends no output to another 
subfilter. Now the general form of the primary subfilter will be described in detail. 

The primary subfilter maintains a vector (412) of states Xj(k) at each time k. 
An internal single time step delay unit (414) feeds the vector state to a coupling unit 

1 5 (420) and to a matrix unit (416). The matrix unit converts the delayed state vector 
(418) and feeds it to a vector addition unit (408). The input to the filter unit u d (k) is 
expanded and linearly scaled by the input coupling unit (410) to a vector of values of 
the same dimension as the state vector. The vector addition unit then combines its 
two input streams to produce the vector of states for the current time. The operation 

20 just described for the primary subfilter is conveniently described in mathematical 
matrix and column vector notation as: 

x I (^)=A 1 x 1 (A:-l)+b lU rf (A:) (3) 

Such an equation is known, to those skilled in the art, as a linear state space equation 
with a single input. If no structure is imposed on A, or b„ then further subfilters are 
25 unnecessary since the cascaded subfilter structure can subsumed into a single 
complicated primary subfilter. However, the preferred subfilter structures as 
described below, or similar to those described below, are essential for a practical 
embodiment and application of the invention. 
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The subfilter coupling unit (420) determines how state values at time k-1 
affect the state units in the next subfilter at time k. In mathematical terms, the 
subfilter coupling unit uses the coupling matrix /] to perform a linear transformation 
of state vector x } (k-l) which is passed to the vector addition unit of the next 
5 subfilter. The operation of a secondary subfilter is conveniently described in 
mathematical matrix and vector notations as: 

x s (k) = A,x, (* - 1)+ r, X ,_, (* - 1)+ b s u d (k) (4) 

In the preferred embodiment, the subfilters are all of first or second order. A 
first order subfilter maintains just one state. The preferred embodiment for a first 
10 order primary subfilter (500) is shown in FIG. 5. The vectorizing unit (502) and the 
matrix unit (504) collapse to become scaling operations so that the state vector (506) 
is represented by: 

^W-irt^ij+fi-^yW (5) 

The preferred embodiment for a first order secondary subfilter (600) is shown 
15 in FIG. 6. The secondary subfilter receives no direct input, but instead receives 
cascaded input from the previous subfilter. The preferred coupling is a loose 
coupling scheme (602) in which only the last state component of the previous 
subfilter contributes. Note that the previous subfilter is not required to be a first order 
subfilter. The state vector (606) is represented by: 

20 x,(k) = JL s x,{k-\)+{\-Jl s )x,_ lJast {k-l) (6) 

where the matrix unit X s (604) is a scalar. 

Second order subfilters maintain two states. The preferred embodiment for a 
second order primary subfilter (700) is shown in FIG. 7. In this figure, the state 
vector x t (k) is shown in terms of its two components x tl (k) (708) and x n (k) (710). The 
25 vectorizing unit (702) creates two inputs to the vector addition unit (714), the second 
of which is fixed at zero. The delayed states (704) and (706) are fed to the matrix unit 
(712) whose outputs are also fed to the vector addition unit (712) which adds the 
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matrix transfomied states to the vectorized inputs producing the current state. Note 
that due to the (1,0) structure of the second matrix row, and the zero second 
component of the vectorizing unit component, the current second state component 
(710) is just equal to the delayed first component (704): 

*u(*)-*h(*-l) 

The preferred embodiment for a second order secondary subfilter (800) is 
shown in FIG. 8. In this figure, the state vector x s (k) is shown in terms of its two 
components x s ,(k) (808) and x s2 (k) (810). The preferred coupling with the previous 
subfilter unit is a loose coupling scheme (802) in which only the last state component 

10 of the previous subfilter contributes to the first state component of the current 

subfilter. Note that the previous subfilter is not required to be a first order subfilter or 
second order subfilter. The output of the coupling unit is fed to the addition unit 
(814). The delayed states (804) and (806) are fed to the state matrix unit (812) whose 
outputs are also fed to the vector addition unit (812) which adds the state matrix 

1 5 transformed states to the output of the coupling unit, producing the current state. 
Note that due to the (1,0) structure of the second state matrix row, and the zero 
second row of the coupling matrix, the current second state component (810) is just 
equal to the delayed first component (804): 

20 If the device is operating in horizon mode current states along the path are 

maintained in a separate storage area so as not to corrupt the prediction mode states. 
In horizon mode, k indexes the input path and the states are initialized at the start of 
the path (k =0) to the prediction mode states. In addition the states at the output of 
the filter unit are buffered for use in reverse horizon mode. 
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V.l .4 FORWARD RUNTIME OPERATION OF THE NON-LINEAR ANALYZER 
Referring again to FIG J, the outputs (28) of the filter units (24) provide 
input to the non-linear analyzer (26). The exact structure and configuration of the 
non-linear analyzer (26) is not central to this application. It is the interaction of the 
5 non-linear analyzer (26) with the filter units (24), and the operation and configuration 
of the filter units (24) that forms the core of this invention. The preferred 
embodiment, for reasons discussed in the summary of the invention is a hybrid 
parallel combination of linear and non-linear. However, for clarity of explanation, a 
standard neural network structure is described which is well known to those skilled 
10 in the art. This structure is shown in FIG 9. The equations for this structure are: 

N 
r'-l 

tanh(# A (*)) (9) 

*(*)-f;«v7*to 

V.1.5 FORWARD RUNTIME OPERATION OF THE POSTPROCESSING UNIT 
The postprocessing unit (32) in FIG. 1 is used to scale the output from the 
normalized units to engineering units. The postprocessing unit can also shape the 
1 5 data by passing it through a non-linear transformation. However, the preferred 

embodiment uses a simple scale and offset. For consistency with the preprocessing 
units, the scale and offset represent the mapping from engineering units to 
normalized units. 

j>£ (*)=-;<*)— (10) 

20 The scale and offset values as stored in the configuration file (30 - FIG. I) 

and are determined in the configuration mode. 
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V.2 REVERSE RUNTIME OPERATION OF THE PREDICTION DEVICE 

The reverse horizon mode of operation is only allowed immediately 
following horizon mode operation. Horizon mode operation buffers the states (28) 
output by the filter units (24) over the course of the forward path. The purpose of 
5 reverse horizon mode is to obtain the sensitivity of any point y(k) of the prediction 
path (output by the device in horizon mode) with respect to any point in the input 
path u(l). 

In order to use the invention for process control applications, the 
mathematical derivatives of the prediction with respect to the inputs are required. 

1 0 The mathematical derivatives measure how sensitive a state is in response to a small 
change in an input. The dynamic nature of the predictive device means that a change 
in input- at time k will start to have an effect on the output as soon as the minimum 
dead-time has passed and will continue to have an effect infinitely into the future. In 
most practical applications systems are identified to have fading memory so that the 

1 5 effect into the future recedes with time. For MPC applications the aim is to plan a 

sequence of moves for the inputs corresponding to manipulated variables (MVs). The 
effect of these moves needs to be predicted on the controlled variables (CVs) along a 
prediction path. A constrained optimization algorithm is then used to find the move 
sequences that predict an optimal prediction path according to some desired criteria. 

20 In reverse horizon mode, the external device controller specifies the output 

path index k. The device then outputs in sequence the sensitivities (64) in reverse 
order at the input nodes of the device. In the detailed description below, the 
sensitivity of the output y E (k) of the device with respect to any variable v is 
represented by Q t v .It is this sensitivity value, rather than an external data value that 

25 is fed back through the device when operating in reverse horizon mode. 

V.2.1 REVERSE RUNTIME OPERATION OF THE POSTPROCESSING UNIT 

The reverse operation of the postprocessing unit (32) is to scale data received 
at its output node using the inverse of the feedforward scaling shown in equation 
(10): 
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n k y(k) = sn k y E (k) (11) 

Since the sensitivity of the output with respect to itself is: 

CW*)=1 . (12) 

the postprocessing unit always receives the value of 1 at its output node in reverse 
5 operation. 

V.2.2 REVERSE RUNTIME OPERATION OF THE NON-LINEAR ANALYZER 

The reverse runtime operation of a neural net model is well known to those 
skilled in the art and is shown in FIG. 10. The output from the reverse operation of 
the postprocessing unit C2 k y(k) is presented at the output node of the non-linear 

10 analyzer (26). The information flows in a reverse manner through the non-linear 
analyzer (26) and the resulting sensitivities (62) are output at the input nodes of the 
non-linear analyzer (26): 

n k 77»(k)=c h n y (k) 
Q^ A (^)=Q A7 ,(^tanh^ A (^)) 

= n^Mi - V h (*)Xi + **(*)) < 13 > 

V.2~3 REVERSE RUNTIME OPERATION OF A FILTER UNIT 

15 The effect of a change in the delayed input u d (l) on a the sequence of states 

being output from a filter unit (24) in horizon mode is complex due to the 
dependencies of a subfilter ! s states based on the previous subfilter's states and on the 
subfilter f s previous states. An efficient solution can be derived using the chain rule 
for ordered derivatives (Werbos, 1994) and is achieved by the reverse operation of 

20 the filter unit (24). In reverse horizon mode, the output of each filter unit (24) 
receives the vector of sensitivities 0. s x s (A:) propagated back from the non-linear 
analyzer (26) operating in reverse mode: 
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a k x s (k) 

Af(Q tXj (z+i))+r/ +I (n 
A T s (n k x s (i+i)) 

0 

v. 



l = k 

, k x s+l (l + l)) l<k,l£s<S 

l<k,s = S 
l>k 



(14) 



s 



n t u d {i) 



10 



15 



20 
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The operation of these equations is shown in FIG. 11, which shows the filter 
structure of FIG. 4, but with data flowing in the reverse direction. Given the point k 
in the output path for which the sensitivities are being calculated, the vector of 
sensitivities Cl k x s (k) is presented at the output channels (1120, 1122. ..1124) of the 
filter unit (24) and cycled in reverse through the filter structure. This reverse 
operation is indexed by / < k . At each iteration / , the resulting sensitivity Q. k u d (l) 

is output at the input channel (1110) of the filter unit (24). For / < k the external 
input at the output channels (1120, 11 22... 11 24) is in practice zero vector since 
Cl k x s (/) = 0 . However, the filter unit (24) itself is not constrained to operate under 
this assumption. 

In FIG. 11, the reverse operation of a delay (1130) is represented by q which 
is the unit delay in the reverse time direction since the index / is decreasing at each 
iteration. 

The reverse operation of a matrix operation (1132, 1134) or a vector 
operation (1136) is represented mathematically as the transpose of the forward 
operation. The physical justification for this is shown in FIG, 14 which shows the 
individual channels represented by a 3x2-matrix operation which in forward 
operation maps two input channels to three output channels, and in reverse operation 
maps three input channels to two output channels. 
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V.2.4 REVERSE RUNTIME OPERATION OF A DELAY UNIT 

The reverse operation of a delay unit (22) corresponds to a delay in the 
reverse sequencing: 

n k u(i)=n k u d (i+d) (15) 

5 V.2.5 REVERSE RUNTIME OPERATION OF A PREPROCESSING UNIT 

The reverse operation of a preprocessing unit (20) is to scale data received at 
its output node using the inverse of the feedforward scaling shown in equation (1): 

S 

V.3 CONFIGURATION MODE 

1 0 The predictive device is configured, in the preferred embodiment, using 

training data collected from the process. However, a process engineer can override 
any automated configuration settings. The training data set should represent one or 
more data sets which have been collected at the same base-time sample rate that will 
be used by the external device controller to present data to the predictive device in 

1 5 prediction mode. Each set of data should represent a contiguous sequence of 
representative. 

In order to allow operator approval or override of the configuration settings, - 
the training of the predictive device is done in stages, each stage representing a major 
component of the predictive device. 

20 V.3.1 CONFIGURING THE PREPROCESSING AND POSTPROCESSING UNITS 
The scale and offset of a preprocessing or postprocessing unit is determined 
from the desire to map the minimum E min and maximum E max of the corresponding 
variable's engineering units to the minimum and maximum N max of the 
normalized units: 
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W x - N- 

max nun 



E — E 

max min x-t ~\ 

/r AT _/T A/ ^ 
^ — max min min max 

~ 77 _ P 

max min 

The preferred normalized units have N min —1> N^-l. The engineering units may be 
different for each input variable, leading to a different scale and offset for each 
preprocessing/postprocessing unit. 

5 V.3.2 CONFIGURING A DELAY UNIT 

The configuration of a delay unit (22) is not a central aspect of this 
application. FIG 12 shows a simple advisory procedure for suggesting delay times. 
A process engineer can override these advisory settings. In this procedure d^ and 
^max are user settable limits for the delay time and the procedure calculates a delay 
10 time d such that 

d • <d <d 

mm ~ ** — **max 

V.3.3 CONFIGURING A FILTER UNIT 

A practical means of configuring a filter unit (24) is an essential aspect of this 
invention. The preferred method of configuration is initialized using the simplified 
15 filter structure shown in FIG. 13 in which all subfilters are first order and decoupled. 
This is the structure used in (Graettinger, et al ? 1994). It is important to note that this 
structure is used for initialization of the configuration procedure and does not 
represent the final suggested filter configuration. 

Step 1 

20 The operator specifies an appropriate dominant time constant T t associated 

with each input variable. This can be specified from engineering knowledge or 
through an automated approach such as Frequency Analysis or a Back Propagation 
Through Time algorithm. The value of the initial time constant is not critical the 
proposed configuration method automatically searches the dominant time range for 

25 the best values. 
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Steo 2 

For each input, initialize the filter structure in FIG. 13 using a high order 
system where a number of first order filters are created around the given dominant 
time constant (dominant frequency, dominant dynamics). For example, a fifth order 
system can be created using: 



A n — e 



A i2 =e 



At 



At 
0.757} 



6J 



(18) 



At 



A i4 =e x * ST > 



A i5 =e 



At 
1ST, 



In this simple filter structure, each subfilter (1302, 1304, 1306) yields a 
corresponding single state (1312, 1314, 1316) which is decoupled from the other 
subfilter states. This initial filter structure represents the equation 

10 x(k)= Ax(k-l) + Bu d {k) (19) 

which has a simplified diagonal block structure of the form 



*(*)= 



A = 



x 2 (k) 

o 



B = 



0 
0 
0 

b, 

0 
0 
0 



0 
0 



0 
0 



0 

0 
0 



0 
0 



0 

0 



(20) 
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A ; = 



b, = 



0 A l2 
0 0 
0 0 

l-A n 

l-A, 



'12 



l-A, 



Step 3 



0 
0 



0 



(21) 



Map the contiguous input training data through the delay units (22) and filter 
5 structure (24) to obtain a set of training state vectors {x(kjk = 1, • • • ,t] . Then find a 
vector c that provides the best linear mapping of the states to the corresponding 
target outputs = 1, * • • » t} . One way of doing this is to use the Partial Least 

Squares method that is well known to those skilled in the art. This results in a multi- 
input, single-output (MISO) state space system {A,b,c r } in which equations (19), 
10 (20), and (21) are supplemented by the equation: 

y(k)=c r x (22) 

where 



c = 















. c w. 




- C iS. 



(23) 



Step 4 
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Balance each subsystem {A.,b,.,cf }of the MISO block diagonal system 

based on controllability & observability theory. The balancing procedure allows 
order reduction of a state space system by transforming the states so that the 
controllability and observability properties of the original system are substantially 
5 concentrated in the first part of the state vector. 

For each input variable, indexed by /, perform the balancing procedure on the 
sub-system [A i9 b i9 cJ }. Balancing of a linear state space system is a method of 

reduction well known to those skilled in the art. Other methods of model reduction, 
such as Hankel reduction, can be substituted. A summary of the balancing method is 
10 now given. 

For each sub-system {A„b,,cf }, compute the controllability and 
observability Gramians P. > 0, Q, > 0 that satisfy the equations: 



A / P,Af-*i=-*#bf 
A, r Q,A,-Q f =-*,c, r 



(24) 



Find a matrix using the Cholesky factorization method, such that 

15 P,»R, r R f . (25) 

Using the singular value decomposition method, diagonalize to obtain the following 
decomposition: 

R,Q,.Rf = U^Uf (26) 

Define 

20 t; 1 =Rfu,.z: ,/2 (27) 

then 

T^^YQ/r^^ (28) 

and the balanced subsystem is obtained through a similarity transform on the states 
as: 
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A, = T t A g T? 1 ,b, = T,b |f c, r = cf Tf 1 (29) 

Step 5 

Using balanced subsystems find out dominant time constant for each input by 
reducing each balanced model to a first order model. This is done by considering the 
5 dynamics of all but the first state of each input's filter unit (24) to have reached 
steady state. This leads to: 

T t = — (30) 

' In(^) 

where 

fl/=flm +i S2^-A ffl )" , a ni „ (31) 

10 and 



A,- 



a i\\ a il2 
L a /2I ^ 122 J 



(32) 



Check the convergence of the dominant time constant estimation: 
If 

(33) 

15 or the number of iterations has exceeded the maximum allowable, go to step 6. 

Otherwise, return to step 2. The maximum number of iterations and e are parameters 
of the training method. 

Step 6 

Once an accurate estimate of the dominant time constant is available for each 
20 input variable, the eigenvalues \d?\s = 1,- ■ - ,5} of the controllability gramian P. 
(equivalently the observability gramian) are calculated; these are always positive 
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and real because the controllability gramian is positive definite. The final order Sj of 
each filter unit (24) is then calculated such that 

<0^^f (34) 

where G is parameter of the training method and is a value less than 1, a good 
5 practical value being 0.95. This order represents the total number of states of an 
individual filter unit (24). 

After determining the model order, truncate the A f . matrix so that just the first 
Si states are used; this truncation is done by selecting the upper left Spc Sj submatrix 
of A / . Then calculate the Sj eigenvalues of the truncated A,, matrix 

10 Now configure each filter unit (24) using the preferred first and second order 

subfilter configurations with the preferred couplings as shown in FIG 5 through FIG. 
8. Use a first order filter for each real eigenvalue. Use a second order filter for each 
pair of complex eigenvalues {l 9 A } , where, in FIG. 7 (equation 7) or FIG. 8 
(equation 8): 

a,, — /l+A 

15 (35) 

The preferred ordering of these subfilter units is according to time-constant, with the 
fastest unit being the primary subfilter. 

Another favored approach is to perform model reduction by initializing with 
Laguerre type filter units as described in section V.4.2, rather than the simple 
20 diagonal filter structure of FIG. 13. Sufficient quantity of Laguerre type filter units 
span the full range of dynamics in the process, and thus the iterative process 
described above is not needed. In fact a non-linear model reduction can be achieved 
by performing a linear model reduction on the linear system whose states are defined 
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by the Laguerre filters and whose outputs are defined by pre-transformed values at 
the hidden layer of the neural net: 

V.3.4 CONFIGURING THE NON-LINEAR ANALYZER 
5 The configuration of the non-linear analyzer (26) is not a central aspect of 

this application. The non-linear analyzer (26) is trained to optimally map the outputs 
of the filter units (24) to the corresponding target output. Training of a neural net is 
described in detail in (Bishop, 95) for example. 

V.4 UNIVERSALITY OF THE PREDICTION DEVICE 
10 The predictive device is shown, in this section, to be able to approximate any 

time invariant, causal, fading memory system (defined below). In order to prove this, 
some precise notation and definitions will be needed. 

V.4.1 NOTATION AND DEFINITIONS FOR UNIVERSALITY PROOF 

Let Z denote the integers, Z + the non-negative integers and Z_ the non- 
15 positive integers respectively. A variable u represents a vector or a sequence in 
accordance with the context, while u(k) represents a value of the sequence at the 
particular time k. 

For any positive integer p > 0, denotes the normed linear space of real N- 
vectors (viewed as column vectors) with norm |u| = max, sf ^ \u n | . Matrices are 

20 denoted in uppercase bold. Functions are denoted in italic lowercase if they are 
scalars and in bold if they are vector valued. 

Let (Z) (respectively /* (Z + ) and /* (Z. )), be the space of bounded R" - 
valued sequences defined on Z (respectively Z + and Z_) with the norm: 

ML =su P* e zK*0| 
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For every decreasing sequence w : Z + -» (0,l], lim vv(/:) = 0 define the 

k—*ao 

following weighted nonn: 

A function F : /£(Z_ ) -» R is called a functional on (Z_), and a function 
5 3 : (Z_ ) — > /°° (z) is called an operator. As a notational simplification the 

parentheses around the arguments of functional and operators are usually dropped; 
for example, Fu rather than F[u] and 3u(#) rather than 3[u](£) . 

Two specific operators are important. The delay operator defined by 
Q d u(k) = u(k-d) 
1 0 and the truncation operator defined by 



Pu(k) = 



\u(k) k<0 
0 k>0 



The following definitions make precise the terms used to characterize the 
class of systems approximated by the predictive device. 

Time invariant : An operator 3 is time-invariant if 

15 Q d 3 = 3Q d WeZ. 

Causality : 3 is causal if u(/) = v(/)V/ < k => 3u(Ar) = 3v(&) . 

Fading Memory: 3 : (z) -» /" (z) has fading memory on a subset 
K_ c/; (Z.) if there is a decreasing sequence w : Z + — > (0,l]> lim = 0 , such that 
for each u,vg K_ and given s > 0 there is a S > 0 such that 
20 ||u(*)-v(*| w <s => |3n(o)-3v(oJ<* 

Every sequence u in /^(Z_)can be associated with a causal extension 
sequence u c in /;(Z) defined as: 
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\u{k) k<0 
|u(o) * > 0 



and each time invariant causal operator 3 can be associated with a functional F on 
/~(Z_) defined by 

Fu = 3u c (0) 

5 The operator 3 can be recovered from its associated functional F via 

Zu{k) = FPQ- k u (36) 

Then, 3 is continuous if and only if F is, so the above equations establish a one to 
one correspondence between time invariant causal continuous operators and 
functionals F on /„(Z_). In the next the definition of the Laguerre system is given. 
10 These can be configured in the general filter structure of FIG. 4 but also have 
important theoretical properties. 

V.4.2 LAGUERRE SYSTEMS 

The set of the Laguerre systems is defined in the complex z-transform plane 

as: 



15 L, = 



, fa** 



z — a . 

j 



l—djZ 
Z — Clj . 



s = 0,1, • ■ ■ ,oo, i = l 9 -" 9 N 



where: 

4 (z) : is the Z transform of V 5 (k), the 5-th order system for the z-th input. 

a t : is the z-th input generating pole, such that \a\ < 1 . This pole is selected as 

a. = 1 , where T is the dominant time constant for the z-th input 

20 variable. 

d { : is the time delay associated with the z-th input variable. 
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7.: =l-0, 2 

The whole set of Laguerre systems can be expressed in a state space form that 
shows a decoupled input form and therefore can be mapped to the general filter 
structure in FIG 4. Each filter unit (24) is configured as a single structured {A,,B,-} 

5 subfilter. The structure of A,, is a lower triangular matrix, and b . = [l 0 • • • Of . 

The key point here is that the representation is decoupled by input. Balancing 
can be done to decrease the order of the Laguerre systems, and similarity transforms 
can be done on the Laguerre filters in order to simplify the configuration to utilize 
the preferred subfilter units. Similarity transformations do not affect the accuracy of 
10 the representation and so proving that the use of Laguerre filters decoupled by input 
approximate any time invariant, causal, fading memory system is equivalent to 
proving the preferred subfilter structure can approximate any such system. The 
balancing is a practical mechanism to reduce order without degrading performance. 

V.4.3 PROOF OF APPROXIMATION ABILITY OF LAGUERRE SYSTEMS 
15 First some preliminary results are stated: 

Stone- Weierstrass Theorem (Bovd«1985>. 

Suppose E is a compact metric space and G a set of continuous functionals 
on E that separates points, that is for any distinct u, v e E there is a G e G such that 
Gu * Gv . Then for any continuous functional F on E and given s > 0 , there are 

20 functionals, (G^-G^-^G^-GfJcG, £ = and a polynomial 
p : R 5 -» R , such that for all u e E 

The reason for the group indexing, which is not necessary for a general statement of 
the Stone-Weierstrass theorem, will become apparent in Lemma 2 when each block 
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with a Laguerre operator. In addition, three lemmas are necessary before the theorem 
can be proved. 

Lemma 1 : K_ s {u e (z. Jo < ||uj| < c, }, is compact with the ||-|| w norm. 

Proof: Let vS p) be any sequence in K_ . We will find a u (0) e K_ and a subsequence 

5 of u (/,) converging in the II w norm to u (o) . It is well know that K_ is not compact in 
/£(Z_) with the usual supremum norm (Kolmogorov, 1980). For each /, let be 
K_[-l,o] the restriction of K_ to [-/,0]. K_[-!,0] is uniformly bounded by c 5 and 
is composed of a finite set of values, hence compact in /£[-/,()]. Since i£_|-/,o] is 
compact for every /, we can find a subsequence u (p "^ of and a u (o * e AT_ j-/,o] 
10 along which converges: 

suplu^^-u^M^O as m-»oo (37) 

Now, let e> 0. Since 0 as k --> oo 5 we can find w 0 > Oa such that 

w(m 0 )< s/c : . Since u^ Pm \u^ €^„,we have that 

sup|u ( ^ ) (*)-u(° ) (^w(-A)<2c 1 >v( Wo )< ^ (3 8 ) 

15 Now from equation (37) we can find m } such that 

sup \u lPm) (k)-u {0) (k)l<£ for m>m x (39) 

so by equation (38) and equation (39) we can conclude that 
| u (pJ _ u (0) |^ < s for m>m l 

which proves that K_ is compact. 
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Lemma 2. The set of functional {g^ } associated to the discrete Laguerre Operators 
are continuous with respect to ||| w norm, that is, given any e > 0 there exists S > 0 
such that ||u - v|| w < S => |G> - G>| < * 



Consider the functional G^Q associated with the Laguerre operator 40. 
5 Given <f > 0 , chose J > 0 such that: 

K-v,L<* =* \S' t u,-Oiv t \<s (40) 

This is possible due to the continuity of the one dimensional Laguerre operators with 
respect to the weighted norm as shown in (Sentoni et al, 1996). Therefore, from 
equation (40) and the definition of the functional 

10 |ju- v|| w < S=>\u. t - v f | w < *=>|G> - G>| = \G[u t - G>, \ <s (41) 

which proves Lemma 2. 

Lemma 3. The {g' s } separate points in (Z_ ), that is, for any distinct u, v e (Z_ ) 
there is aC^G such that G> * G[v . 

Pro*?/ Suppose u, v e l„ (Z_ ) are equal except for the z'-th component. Then 

1 5 c s u * g; v o g>,. * g; v,. (42) 

by the definition of the functional. It is known from one dimensional theory 
(Sentoni et al, 1996) that for any distinct u.,v f e r(Z_) there is a G; such that 
G i s u i * G[v. ; this result together with equation (42) proves Lemma 3. 

A pproximation Theorem 

20 Now given e > 0 , Lemmas 1 , 2, 3 together with the Stone- Wei erstrass 

theorem imply that given any continuous functional F on K_ , there is a polynomial 
p : R s -> R , such that for all u e ^T. 

iFu-^G^-^Giu^-sG^-.^G^u^^ (43) 
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Because the Laguerre systems are continuous and acting on a bounded space, the 
G^u are bounded real intervals on so the polynomial p can be replaced by any static 

model that acts as a universal approximator on a bounded input space, for example, a 
neural net. In other words (43) can be replaced by 

5 |Fu-M^(G>,...,Gi i u,.. s G>,...,G,>)|<^ (44) 

A time invariant causal operator 3 can be recovered from its associated functional 
through equation (36) as 

3u(k)=FPQ- k u 
Now let u e K and k e Z , so PQ~ k u <s K_ , hence 
10 |Fi>2'*u-AW^ 

Since the last equation is true for all k e Z , we conclude that for all ue^. 
||3u-3u||<<5r 

In other words, it is possible to approximate any nonlinear discrete time 
invariant operator having fading memory on K , with a finite set of discrete Laguerre 
15 systems followed by a single hidden layer neural net. This completes the proof. 

V.5 EQUIVALENTS 

Although the foregoing details refer to particular preferred embodiments of 
the invention, it should be understood that the invention is not limited to these 
details. Substitutions and alterations, which will occur to those of ordinary skill in 
20 the art, can be made to the detailed embodiments without departing from the spirit of 
the invention. These modifications are intended to be within the scope of the present 
invention. 
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CLAIMS 

What is claimed is: 

1 . A predictive device for modeling a non-linear, causal, multiple-input single- 
5 output system or process, comprising: 

a plurality of preprocessing units for receiving a working signal including 
control data inputs, the preprocessing units normalizing the control 
data inputs, resulting in preprocessed inputs; 

a plurality of delay units coupled to the preprocessing units, the delay units 
10 time aligning the preprocessed inputs, resulting in time aligned inputs; 

a plurality of filter units coupled to the delay units, the filter units being of a 
substantially simplified configuration as compared to a configuration 
based upon discrete state space equations, the filter units filtering the 
time aligned inputs at least according to time, resulting in filtered 
15 states; 

a non-linear analyzer coupled to the filter units and accepting the filtered 
states, the non-linear analyzer generating a single analyzer output; 

a postprocessing unit coupled to the non-linear analyzer to receive the 

generated analyzer output, the postprocessing unit converting the 
20 single analyzer output to a single device output that represents an 

estimate or prediction of the output of the multiple-input single-output 
dynamic system being modeled by the device, and 

wherein the predictive device operates in a plurality of selectable modes 
including a configuration mode and multiple runtime modes. 

25 

2. The device of Claim 1 wherein: 
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the data generated by the predictive device, in any of the selectable modes, 
are received by a device controller for analysis, monitoring, 
optimization or control of the modeled process. 

5 3. The device of Claim 1 wherein the preprocessing units normalize the control data 
inputs by scaling and offsetting the control data inputs, resulting in preprocessed 
inputs. 



4. The device of Claim 1 wherein the postprocessing unit normalizes the analyzer 
10 output by scaling and offsetting the analyzer output, resulting in a postprocessed 
device output . 



5. The device of Claim 1 further comprising a plurality of training units for 
configuring the predictive device in configuration mode, the training units 
15 including: 

a preprocessing training unit to set overrideable parameters, including scale 
and offset settings, in the plurality of preprocessing units; 

a delay training unit to set overrideable delay times to the plurality of delay 
units; 

20 a filter training unit to configure the plurality of filter units; 

a non-linear analyzer training unit to train the non-linear analyzer to 
optimally map the filtered states to the analyzer output; 

a postprocessing training unit to set overrideable parameters, including scale 
and offset settings, in the plurality of postprocessing units; and 

25 wherein the training units are activated when the predictive device is operated 

in configuration mode. 
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6. The device of Claim 5 wherein the filter training unit is used to configure, in a 
practical manner allowing relatively simple override by an operator, the plurality 
of filter units by using input training data for control data input, and by using a 
corresponding measured process output for target device output training data, and 
5 by: 

(a) setting an initial estimated dominant time constant associated with each 
input training data; 

(b) for each input training data, initializing a filter structure based on the 
input training data's initial estimated dominant time constant; 

1 o (c) for each target device output training data, determining a corresponding 

target analyzer output training data 

(d) mapping input training data through the delay units and the filter 
structure, resulting in a plurality of training filter state vectors, creating a 
vector trained to linearly map training states to the corresponding target 

1 5 analyzer outputs, resulting in a multi-input, single-output state space 

system consisting of a block structure of single-input single-output sub- 
systems, one for each filter unit; 

(e) applying a method of order reduction to each single-input, single-output 
sub-system, resulting in a set of reduced order single-input, single-output 

20 " sub-systems; 

(f) reducing each reduced single-input, single-output sub-system to a first 
order system to determine an updated estimated dominant time constant 
for each input training data; 

(g) repeating steps (a) through (f) for one or more iterations using the updated 
25 estimated dominant time constants in place of the initial estimated 

dominant time constants; and 

(h) configuring each filter unit using the corresponding reduced order single- 
input, single-output sub-system. 
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7. The device of Claim 5 wherein the filter training unit is used to configure, in a 
practical manner allowing relatively simple override by an operator, the plurality 
of filter units by using input training data for control data input, and by using a 
corresponding measured process output for target device output training data, and 

5 by: 

(a) setting an estimated dominant time constant associated with each input 
training data; 

(b) for each input training data, initializing a filter structure based upon the 
input training data's estimated dominant time constant; 

10 (c) for each target device output training data, determining a corresponding 

target analyzer output training data 

(d) mapping input training data through the delay units and the filter 
structure, resulting in a plurality of training filter state vectors, creating a 
linear or non-linear analyzer trained to map training states to the 

15 corresponding target analyzer outputs, where said analyzer has a set of 

internal states which are linearly dependent on a current filter state vector, 
resulting in a linear multi-input, multi-output state space system mapping 
input training data to said internal analyzer states, where said linear multi- 
input, multi-output state space system consists of a block structure of 

20 single-input multi-output sub-systems, one for each filter unit; 

(e) applying a method of order reduction to each single-input, multi-output 
state space system, resulting in a set of reduced order single-input, multi- 
output sub-systems; and 

(f) configuring each filter unit using the corresponding reduced order single- 
25 input, multi-output sub-systems. 

8. The device of claim 7 wherein the non-linear analyzer is a neural network and the 
internal states are summation values of the neural network hidden layer. 
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9. The device of claim 7 wherein the linear or non-linear analyzer is a Partial Least 
Squares model and the internal states are latent variable scores. 

10. The device of Claim 6 or Claim 7 wherein the method of reduction is a Hankel 
5 model reduction procedure. 

1 1 . The device of Claim 6 or Claim 7 wherein the method of reduction is an 
Internally Balanced model reduction procedure. 

10 12. The device of Claim 6 or Claim 7 wherein the method of reduction is an iterative 
application of any of a variety of model reduction procedures including a Hankel 
model reduction procedure and an Internally Balanced model reduction 
procedure. 

15 13. The device of Claim 1 wherein the plurality of selectable runtime modes includes 
a predictive mode in which: 

(i) the predictive device receives a contiguous stream of control data 
inputs at asynchronous discrete base sample time; and 

(ii) the predictive device is operated once per base sample time. 

20 

14. The device of claim 13 wherein the contiguous stream of control data inputs is 
passed from a device controller and the analyzer output is received by the device 
controller for analysis, monitoring, optimization or control of the modeled 
process. 

25 

15. The device of Claim 1 wherein the plurality of selectable runtime modes 
comprises an horizon mode in which the predictive device: 
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receives an externally defined sequence of trial future data inputs proceeding 
from a current prediction mode device state; 

is operated in response to this trial sequence of data inputs producing a 

corresponding sequence of at least filtered states, and possible other 
5 state information; and 

stores the filtered states and other state information for use in reverse horizon 
mode. 

16. The device of claim 15 wherein the horizon mode is run one or more times 
10 between runs of the predictive device in the predictive mode. 

17. The device of claim 15 wherein 

a contiguous stream of external trial data inputs is passed to the predictive 
device from a device controller; and 

15 the predictions generated during horizon mode are received by the device 

controller for analysis, monitoring, optimization or control of the 
modeled process. 

18. The device of Claim 1 wherein the plurality of selectable runtime modes 
comprises a reverse horizon mode in which the predictive device uses 

20 (i) the filtered states and other state information from the most recent 

horizon mode run, and 

(ii) an output path index indicating a point in a generated sequence of 
predictions to obtain the sensitivities of the predictive device to 
changes in the trial input data sequence used by a most recent horizon 
25 mode run, based upon running the predictive device backwards. 

19. The device of claim 18 wherein the reverse horizon mode is run one or more 
times between runs of the predictive device in the predictive mode. 
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20. The device of claim 18 wherein 

the predictive device sensitivities generated during reverse horizon mode are 
received by a device controller for analysis, monitoring, optimization 
5 or control of the modeled process. 

21 . The device of claim 1 8 wherein 

a device controller specifies the output path index. 

10 22. The device of Claim 6 or Claim 7 wherein the initializing of the filter structure 
uses a Laguerre expansion. 

23. The device of Claim 1 wherein the plurality of filter units comprise: 

first and/or second order subfilters. 

15 

24. The device of Claim 1 wherein the non-linear analyzer comprises: 

a neural network. 

25. The device of Claim 1 wherein the non-linear analyzer comprises: 
20 a linear or non-linear Partial Least Squares model. 

26. The device of Claim 1 wherein the non-linear analyzer comprises: 

a hybrid parallel combination of a linear model with a non-linear 

model. 

25 

27. A computer method for modeling a non-linear, causal, multiple-input single- 
output, system or process, comprising the steps of: 
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(a) receiving and normalizing a working signal including control data inputs, 

resulting in preprocessed inputs; 

(b) aligning the preprocessed inputs, resulting in time aligned inputs; 

(c) using a plurality of filter units, filtering the time aligned inputs, at least 
5 according to time, resulting in filtered states; 

(d) generating an analyzer output based upon the filtered states, said 

generating employing a non-linear analyzer; and 

(e) converting the analyzer output to a model output that represents an 

estimate or prediction of the output of the multiple-input single-output 
1 0 dynamic system being modeled by the method. 



28. The method of Claim 27 wherein the step of receiving includes receiving a 

contiguous stream of control data inputs from an external system, said data inputs 
representing measurements from the modeled process; and 

1 5 further comprising the step of passing model output to an external device or 

method for analysis, monitoring, optimization or control of the 
modeled process. 



29. The method of Claim 27 wherein the normalizing step employs a scale and offset 
20 for each input. 

30. The method of Claim 27 wherein the converting step employs a scale and offset. 

3 1 . The method of Claim 27 further comprising the step of configuring including the 
25 steps of: 

(a) setting overrideable parameters, including scale and offset settings, for the 
receiving and normalizing step; 



WO 99/17175 



PCT/US98/20295 



-38- 

(b) setting overrideable delay times, for the aligning the preprocessed inputs 

step; 

(c) training the plurality of filter units used in the filtering of the time aligned 

inputs step; 

5 (d) training the non-linear analyzer to optimally map the filtered states to the 

analyzer output in the generating analyzer outputs step; and 

(e) setting overrideable parameters, including scale and offset settings, for the 
converting the analyzer output to a model output. 

10 32. The method of Claim 3 1 wherein the step of training the filter units includes: 

employing a filter training unit to configure, in a practical manner 
allowing relatively simple override by an operator, the plurality of filter units 
by using input training data for control data input, and by using a 
corresponding measured process output for target model output training data, 
15 and by: 

(a) setting an initial estimated dominant time constant associated with each 
input training data; 

(b) for each input training data, initializing a filter structure based on the 
input training data's initial estimated dominant time constant; 

20 (c) for each target device output training data, determining a corresponding 

target analyzer output training data 

(d) mapping input training data through the delay units and the filter 
structure, resulting in a plurality of training filter state vectors, creating a 
vector trained to linearly map training states to the corresponding target 

25 analyzer outputs, resulting in a multi-input, single-output state space 

system consisting of a block structure of single-input single-output sub- 
systems, one for each filter unit; 
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(e) applying a method of order reduction to each single-input, single-output 
sub-system, resulting in a set of reduced order single-input, single-output 
sub-systems; 

(f) reducing each reduced single-input, single-output sub-system to a first 

5 order system to determine an updated estimated dominant time constant 

for each input training data; 

(g) repeating steps (a) through (f) for one or more iterations using the updated 
estimated dominant time constants in place of the initial estimated 
dominant time constants; and 

10 (h) configuring each filter unit using the corresponding reduced order single- 

input, single-output sub-system. 



33. The method of Claim 31 wherein the step of training the filter units includes: 

employing a filter training unit to configure, in a practical manner 
15 allowing relatively simple override by an operator, the plurality of filter units 

by using input training data for control data input, and by using a 
corresponding measured process output for target model output training data, 
and by: 

(a) setting an estimated dominant time constant associated with each input 
20 training data; 

(b) for each input training data, initializing a filter structure based upon the 
input training data T s estimated dominant time constant; 

(c) for each target device output training data, determining a corresponding 
target analyzer output training data 

25 (d) mapping input training data through the delay units and the filter 

structure, resulting in a plurality of training filter state vectors, creating a 
linear or non-linear analyzer trained to map training states to the 
corresponding target analyzer outputs, where said analyzer has a set of 
internal states which are linearly dependent on a current filter state vector, 
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resulting in a linear multi-input, multi-output state space system mapping 
input training data to said internal analyzer states, where said linear multi- 
input, multi-output state space system consists of a block structure of 
single-input multi-output sub-systems, one for each filter unit; 

5 (e) applying a method of order reduction to each single-input, multi-output 

state space system, resulting in a set of reduced order single-input, multi- 
output sub-systems; and 

(f) configuring each filter unit using the corresponding reduced order single- 
input, multi-output sub-systems. 

10 

34. The method of claim 33 wherein the non-linear analyzer step employs a neural 
network and the internal states are the summation values of the neural network 
hidden layer. 

15 35. The method of claim 33 wherein the linear or non-linear analyzer employs a 
Partial Least Squares model and the internal states are latent variable scores. 

36. The method of Claim 32 or Claim 33 wherein the step of balancing employs a 
Hankel model reduction procedure. 

20 

37. The method of Claim 32 or Claim 33 wherein the step of balancing employs an 
Internally Balanced model reduction procedure. 

38. The method of Claim 32 or Claim 33 wherein the step of balancing employs an 
25 iterative application of any of a variety of model reduction procedures including 

a Hankel model reduction procedure and an Internally Balanced model reduction 
procedure. 
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39. The method of Claim 27 wherein, in a predictive mode, the step of receiving 
includes receiving a contiguous stream of control data inputs from an external 
system, said data inputs representing measurements from the modeled process; 

said receiving of data inputs occurs once per base sample; and 

5 the steps (a) through (e) are performed once per base sample. 



40. The method of Claim 39 further comprising the step of passing the model output 
to an external system for analysis, monitoring, optimization or control of the 
modeled process. 

10 

41 . The method of Claim 27 wherein, in an horizon mode, steps (a) through (e) are 
iterated multiple times wherein, at each iteration, the filtered states and other 
state information are stored for later use. 

15 42. The method of Claim 41 further comprising the step of passing the model output 
at each iteration to an external device or method for analysis, monitoring, 
optimization or control of the modeled process. 

43. The method of Claim 27 wherein, in a reverse horizon mode , steps (a) through 
20 (e) are iterated multiple times in reverse order, wherein, at each iteration 

the steps employ stored information from the method of Claim 41; and 

the result of this method is to produce sensitivities of a predictive model 
output for a specified iteration with respect to changes in the 
predictive mode received data at each previous iteration, where said 
25 specified iteration is provided to the method by an external system. 
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44. The method of Claim 43 further comprising the step of passing the calculated 
sensitivities at each iteration to an external device or method for analysis, 
monitoring, optimization or control of the modeled process. 

5 45. The method of Claim 32 or Claim 33 wherein the step of initializing the filter 
structure uses a Laguerre expansion. 

46. The method of Claim 27 wherein the plurality of filter units comprise: 

first and/or second order subfilters. 

10 

47. The method of Claim 27 wherein the non-linear analyzer comprises: 

a neural network. 

48. The method of Claim 27 wherein the non-linear analyzer comprises: 
15 a linear or non-linear Partial Least Squares model. 

49. The method of Claim 27 wherein the non-linear analyzer comprises: 

a hybrid parallel combination of a linear model with a non-linear 

model. 
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